Differentia compositionem facit. A Slower-Paced and Reliable Parser for Latin

Authors

  • Edoardo Maria Ponti
  • Marco Passarotti
Abstract

The Index Thomisticus Treebank is the largest available treebank for Latin; it contains Medieval Latin texts by Thomas Aquinas. After experimenting on its data with a number of dependency parsers based on different supervised machine learning techniques, we found that DeSR, with a multilayer perceptron algorithm, a right-to-left transition, and a tailor-made feature model, provides the highest accuracy. We improved the results further by combining the output parses of DeSR with those of other parsers, outperforming the previous state of the art in parsing the Index Thomisticus Treebank. The key to this improvement is ensuring sufficient diversity and accuracy among the outputs to be combined; for this reason, we performed an in-depth evaluation of the results of the individual parsers that we combined. Finally, we found that, although the general architecture of the parser is portable to Classical Latin, the model trained on Medieval Latin is inadequate for that purpose.
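The abstract does not say how the output parses are combined, so the following is only a minimal sketch of one common approach, assuming a per-token weighted vote over (head, label) pairs. The function name `combine_parses`, the tie-breaking rule, and the dependency labels in the usage note are illustrative assumptions, not the authors' method. A plain vote like this does not guarantee that the result is a well-formed tree; a real combiner would typically add a tree-building step on top.

```python
from collections import Counter


def combine_parses(parses, weights=None):
    """Combine several dependency parses of one sentence by per-token voting.

    Assumption (not from the paper): each parse is a list with one
    (head_index, label) pair per token, heads are 1-based, and 0 marks the root.
    Ties are broken in favour of the parser listed first.
    """
    if not parses:
        raise ValueError("at least one parse is required")
    n_tokens = len(parses[0])
    if any(len(parse) != n_tokens for parse in parses):
        raise ValueError("all parses must cover the same tokens")
    if weights is None:
        weights = [1.0] * len(parses)
    if len(weights) != len(parses):
        raise ValueError("one weight per parse is required")

    combined = []
    for i in range(n_tokens):
        # Accumulate the weight behind each proposed (head, label) pair.
        votes = Counter()
        for parse, weight in zip(parses, weights):
            votes[parse[i]] += weight
        best = max(votes.values())
        # Pick the first parser (in input order) whose proposal has the top score.
        for parse in parses:
            if votes[parse[i]] == best:
                combined.append(parse[i])
                break
    return combined
```

As a usage illustration, for the three tokens of "Differentia compositionem facit" and three hypothetical parser outputs, `combine_parses([[(3, "Sb"), (3, "Obj"), (0, "Pred")], [(3, "Sb"), (1, "Atr"), (0, "Pred")], [(2, "Atr"), (3, "Obj"), (0, "Pred")]])` keeps, for each token, the analysis that two of the three parsers agree on. The vote only helps when the parsers make different mistakes, which is why diversity among the combined outputs matters as much as their individual accuracy.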

Similar resources

Feature Engineering in Persian Dependency Parser

A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of phrase-structure parser fo...

Examining fatigue in COPD: development, validity and reliability of a modified version of FACIT-F scale

INTRODUCTION Fatigue is a disruptive symptom that inhibits normal functional performance of COPD patients in daily activities. The availability of a short, simple, reliable and valid scale would improve assessment of the characteristics and influence of fatigue in COPD. METHODS At baseline, 2107 COPD patients from the ECLIPSE cohort completed the Functional Assessment of Chronic Illness Thera...

Mindfulness-Based Interventions for parents of slow-paced people: A Systematic review and meta-analysis

Objective: Mindfulness-based interventions have been a focus of attention among third-wave cognitive-behavioral practitioners in recent years. A variety of interventional techniques, such as acceptance, emotional control, and cognitive restructuring, are increasingly incorporated into the interventional framework directed at helping parents and caregivers of the slow-paced population. Method: The present stu...

Constructing a Parser for Latin

We describe the construction of a grammar and lexicon for Latin in the AGFL formalism, in particular the generation of the lexicon by means of transduction and the description of the syntax using the Free Word Order operator. From these two components, an efficient TopDown chart parser is generated automatically. We measure the lexical and syntactical coverage of the parser and describe how to ...

Morphological parser for Latin

Morphology describes how words are formed in a language, for example by adding suffixes or prefixes to existing words. In some languages, this process is very productive, and it is thus important for computational linguistics to be able to handle this. The purpose of a morphological parser is to extract information from the morphological structure of a word. In this paper, we examine this probl...

Publication date: 2016